Unit 7 High Availability Solutions
Purpose
This unit describes how HP's high-availability solutions can assist the
high-end customer in maximizing data and system availability.
Objectives
At the end of this unit, you will be able to:
o Explain why high availability is particularly important to high-end
customers.
o List the major high-availability solutions available from HP.
o Describe why HP high-availability solutions are superior to
competitive solutions.
o Describe key points in HP's vision for the future of high
availability.
Introduction
High availability is a critical aspect of systems management that
ensures the consistent and dependable availability of both data and
systems. High availability means that access to mission-critical data
and applications is maximized by:
o Minimizing unplanned down-time
o Minimizing or eliminating planned down-time
High-availability solutions protect a business's ability to function.
This is particularly true with high-end customers because:
o Minutes or hours of down-time may mean thousands of dollars in lost
productivity, lost revenue, and increased expenses.
o High-end configurations are larger, so the impact of unavailable data
and resources affects a larger community of users.
o The significant investment in hardware warrants a highly available
environment.
o Operations may be round-the-clock, and any unavailability has a
significant effect.
In high-end environments, system down-time cannot be tolerated.
HP Solutions
HP Strategy
HP offers a range of products that provide increasing levels of high
availability, enabling customers to tailor the best solutions for their
environments. The focus is on minimizing both planned and unplanned
down-time.
[Figure: High Availability in the Data Center, caption: none]
Key Messages
HP continues to provide industry-leading hardware reliability with PA-
RISC and VLSI technologies. Today, HP's Corporate Business Systems build
on this foundation by providing built-in high-availability features in
the hardware.
HP's software is reliable and is subjected to rigorous quality
testing. In addition, MPE/iX provides a comprehensive software
resiliency strategy that anticipates software failures before they occur
and takes action to further improve system up-time.
In addition to high reliability, HP offers solutions to further
enhance data integrity and availability, system availability, and fault
and disaster tolerance. HP also has solutions to address the various
causes of unplanned failures.
[Figure: Causes of Unplanned Failures, caption: none]
Highly Reliable Hardware and Software
Reliable Hardware
For the HP 3000 and HP 9000
The HP 3000 and HP 9000 Corporate Business Systems incorporate
high-availability features in their error-correcting circuitry, memory
arrays, I/O channels, and main processor memory bus. Easy
deconfiguration of failed CPUs, memory, or I/O interface modules enables
Corporate Business Systems to achieve a maximum hardware up-time of 99.97%.
Also included are standard features such as automatic system recovery
with powerfail (battery backup).
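To put the 99.97% figure in perspective, the short calculation below
converts an availability percentage into hours of down-time per year.
It is a rough sketch only: the function name is hypothetical, the year
is taken as 365 days, and the figure quoted above covers hardware
up-time, not software or planned maintenance.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def annual_downtime_hours(availability_percent: float) -> float:
    """Hours per year a system is unavailable at the given availability."""
    return HOURS_PER_YEAR * (1.0 - availability_percent / 100.0)


if __name__ == "__main__":
    hours = annual_downtime_hours(99.97)
    print(f"99.97% up-time leaves about {hours:.2f} hours of down-time per year")
```

At 99.97%, the result is a little over two and a half hours per year.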
Reliable Software
Both the MPE/iX and HP-UX operating systems have been designed to offer
a superior level of software reliability.
System Boot-Up
During the system boot-up process, if an error occurs while mounting
user volumes, MPE/iX and HP-UX will identify the problem volumes and
continue to mount the others. The systems also provide auto
configuration at system startup.
For the HP 3000
MPE/iX is solid even under heavy loading and is a bullet-proof operating
environment.
Try/Recover Routines
For example, MPE/iX Release 4.0 has been protected in 24 additional ways
from system failures. In addition, try/recover routines, which have
always existed in MPE/iX, have been implemented in even more places for
even higher resiliency. Try/Recover routines enable MPE/iX to recover
from error conditions without failing the system. This ongoing
commitment to software excellence makes MPE/iX the most robust
commercial operating system in the industry, next to MVS/ESA.
Table Monitor
Another bullet-proof capability comes with the use of a Table Monitor.
This feature, available in Q4CY92, monitors table usage and proactively
responds to prevent tables from being exceeded. This feature, combined
with new larger table sizes, will contribute to making the HP 3000 even
more available to users on a daily basis.
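The idea of watching table usage can be sketched as a simple threshold
check. This is an illustration only and not the MPE/iX Table Monitor
itself: the table names, the 90% threshold, and the function name are
all assumptions, and the sketch only reports tables that are close to
their limit; it does not show how the real feature responds.

```python
def tables_near_limit(usage, threshold=0.9):
    """Return the names of tables whose fill ratio has reached the threshold.

    usage maps a table name to an (entries_used, capacity) pair.
    """
    return sorted(
        name
        for name, (used, capacity) in usage.items()
        if used >= threshold * capacity
    )


if __name__ == "__main__":
    # Hypothetical table names and sizes, for illustration only.
    sample = {
        "process_table": (950, 1000),
        "file_table": (400, 1000),
        "lock_table": (95, 100),
    }
    print(tables_near_limit(sample))
```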
Aggregate Parallel Recovery
MPE/iX Release 4.0 incorporates a new feature called Aggregate Parallel
Recovery (APR). This enables system and user volumes to be recovered in
parallel rather than serially. This speeds up the system bootup process,
increasing system up-time.
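The benefit of recovering volumes in parallel can be seen with a small
timing model. It is an idealized sketch: the per-volume recovery times
are made-up numbers, and it assumes every volume can be recovered at
the same time with no contention, which a real system may not achieve.

```python
def serial_recovery_time(volume_minutes):
    """Total time when volumes are recovered one after another."""
    return sum(volume_minutes)


def parallel_recovery_time(volume_minutes):
    """Total time when all volumes are recovered at once (idealized)."""
    return max(volume_minutes, default=0)


if __name__ == "__main__":
    minutes = [4, 3, 5]  # hypothetical recovery time for each volume
    print("serial:  ", serial_recovery_time(minutes), "minutes")
    print("parallel:", parallel_recovery_time(minutes), "minutes")
```

In the serial case the times add up; in the parallel case the slowest
volume sets the total.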
For the HP 9000
HP-UX is the most mature commercial UN*X operating environment in the
industry. Extensive testing is done on all major releases to ensure
reliability and resiliency.
Data Integrity and Availability
Both the HP 3000 and HP 9000 offer a variety of Data Integrity
solutions.
For the HP 3000 and HP 9000
RAID
The use of Redundant Arrays of Inexpensive Disks (RAID), or parity disk
arrays, provides added measures of data availability and recovery in the
event of a disk failure. The use of disk arrays is very common in high-
end environments.
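The way a parity disk array recovers from a disk failure can be shown
with exclusive-OR (XOR) parity, the general technique behind parity
RAID. This is a minimal sketch and not a description of HP's disk array
firmware: the function names and block contents are hypothetical, and
the text above does not state which RAID level HP arrays use.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recreate the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"disk", b"zero", b"data"]      # blocks on three data disks
    parity = xor_blocks(data)               # block on the parity disk
    rebuilt = rebuild_missing([data[0], data[2]], parity)  # disk 1 has failed
    print(rebuilt == data[1])
```

Because XOR is its own inverse, combining the surviving blocks with the
parity block yields the lost block.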
[Figure: High Data Availability, caption: none]
Disk Mirroring
Software disk mirroring further increases the availability of disks
beyond disk arrays. A disk array will not remain available if the array
controller, interface card, or power supply fails. With mirroring, data
is also available from another array with a functioning controller,
interface card, or power supply.
SCSI Mirroring
Mirroring of SCSI disk drives is supported on MPE/iX 4.0 and HP-UX 9.0.
For the HP 3000
MPE/iX Transaction Manager
MPE/iX features an integrated built-in transaction manager which
automatically and transparently provides data integrity for databases,
indexed (keyed) sequential access method (KSAM) files, system and file
system directories, and other critical system tables.
In the event of a system failure or extended power outage, the
contents of memory, databases, files, and critical system tables are
automatically and transparently restored to a state of data integrity by
the transaction manager.
NetBase Shadow
Disk mirroring and disk sharing across a network are now available with
the NetBase Shadow feature, enhancing data availability on the HP 3000.
This feature differs from Mirrored Disk/iX in that it can mirror to more
than 12 sites, and across a WAN. Any mirrored set of disks can be used
for on-line backup purposes.
For the HP 9000
Logical Volume Manager
Because of HP's strong commitment to standards, OSF's Logical Volume
Manager (LVM) is supported on HP-UX. LVM mirroring maintains up to three
copies of data on separate disks. Disks in a mirrored pair or triplet
can be taken off-line for backup while applications continue to access
data on-line. LVM also provides the capability for files (maximum 2
Gbytes) to span multiple physical volumes, improving performance and
availability.
System Availability
For the HP 3000 and HP 9000
SwitchOver
Response to a failure at the system level should be automatic,
quickly returning operations to normal. HP's solutions, such as SPU
SwitchOver/iX for MPE/iX and SwitchOver/UX for HP-UX, provide near-
continuous operation of mission-critical computing environments.
SwitchOver/UX provides for automatic fault detection and recovery of the
failed SPU.
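Automatic fault detection of this kind is commonly built on a heartbeat
timeout: a standby system declares the primary failed when it has heard
nothing for several intervals in a row. The sketch below illustrates
that general technique only. It is an assumption, not a description of
how SwitchOver/UX or SPU SwitchOver/iX is implemented, and the class
name, interval, and missed-heartbeat count are hypothetical.

```python
class StandbyMonitor:
    """Tracks heartbeats from a primary SPU and decides when to take over."""

    def __init__(self, interval_s=5.0, missed_allowed=3):
        self.interval_s = interval_s          # expected time between heartbeats
        self.missed_allowed = missed_allowed  # silent intervals tolerated
        self.last_seen = 0.0                  # time of the latest heartbeat

    def record_heartbeat(self, now):
        """Note that the primary was alive at time `now` (in seconds)."""
        self.last_seen = now

    def primary_failed(self, now):
        """True once the primary has been silent for longer than allowed."""
        return now - self.last_seen > self.interval_s * self.missed_allowed


if __name__ == "__main__":
    monitor = StandbyMonitor()
    monitor.record_heartbeat(0.0)
    print(monitor.primary_failed(10.0))  # still within the allowed silence
    print(monitor.primary_failed(16.0))  # silent too long: switch over
```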
HP Support Watch and HP Predictive Support
HP also provides services to minimize unplanned system down-time. HP
Support Watch for HP-UX and HP Predictive Support for MPE/iX monitor the
system to proactively detect and report hardware faults to the Response
Center and system operator before they cause a failure.
For the HP 3000
AutoRestart/iX
With MPE/iX release 4.0, AutoRestart/iX (included with HP 3000 CS DX)
has been enhanced. Compression is available to reduce the amount of disk
space needed for a dump, and the need to dedicate an entire disk for
dumps has been eliminated. In addition, a toggle has been added to
enable customers to turn the autoboot feature in AutoRestart/iX on or
off.
Disaster Tolerance
Natural disasters that interrupt the availability of systems and data
can ruin an entire business. High-end data center managers are under
pressure to have disaster recovery plans.
For the HP 3000
NetBase
NetBase for MPE/iX offers wide-area disaster recovery. By automatically
maintaining copies of data throughout a geographically dispersed
network, NetBase Shadowing ensures the availability of the data in the
event of a natural disaster. If a system on the network should go down,
one command can redirect file access to an alternate computer, bringing
an unavailable application back on-line in a very short period of time.
For the HP 3000 and HP 9000
Disaster Recovery
HP also offers Disaster Recovery Services to allow customers to plan for
a natural disaster. In the event of a disaster, the HP Backup service
provides around-the-clock access to fully operational configurations.
[Figure: US map, caption: none]
Competition
Reliable Hardware and Software
Although IBM's 3090/ES9000 is perceived to be
extremely reliable, HP's systems provide higher availability due to a
simpler design with fewer parts (the HP 3000 has won DataPro's
reliability rating for years). Water cooling on IBM's 3090, for example,
introduces additional points of failure. Since HP systems are air-
cooled, these points of failure do not exist.
MVS/ESA, as software, provides a high degree of fault resilience to
system failure. In the event of a near-fatal fault, software errors are
trapped via Functional Recovery Routines and are worked around
dynamically.
MVS/ESA offers more granularity in the way it shuts down, but
applications must be specifically coded to take advantage of it. For
example, IBM recommends dedicating one database per application. If the
database fails, only that one application is unavailable. However, it
creates additional programming work and overhead to enable the databases
to interact.
MPE/iX Release 4.0 incorporates many new software resiliency features
to enhance system up-time. It is the most reliable commercial operating
system next to MVS/ESA. HP-UX is the most mature and reliable UNIX
operating environment in the industry.
Data Integrity and Availability
Both HP's MPE/iX and IBM's MVS/ESA provide a high degree of data
integrity through facilities such as transaction logging. Transaction
logging allows for data to be recovered from a log file in the event of
a "soft" failure (using rollback recovery), or a "hard" failure (using
rollforward recovery).
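The two recovery styles can be sketched with a simple log of before and
after values. This is a generic illustration of transaction logging,
not the MPE/iX or MVS/ESA implementation; the record layout, function
names, and sample data are all hypothetical.

```python
from typing import NamedTuple, Optional


class LogRecord(NamedTuple):
    txn: str                # transaction that made the update
    key: str                # item that was updated
    before: Optional[int]   # value before the update (None = item absent)
    after: int              # value after the update


def rollback(db, log, committed):
    """Soft failure: undo updates of uncommitted transactions, newest first."""
    recovered = dict(db)
    for rec in reversed(log):
        if rec.txn not in committed:
            if rec.before is None:
                recovered.pop(rec.key, None)
            else:
                recovered[rec.key] = rec.before
    return recovered


def rollforward(backup, log, committed):
    """Hard failure: reapply committed updates to a restored backup."""
    recovered = dict(backup)
    for rec in log:
        if rec.txn in committed:
            recovered[rec.key] = rec.after
    return recovered


if __name__ == "__main__":
    backup = {"acct": 100}
    log = [LogRecord("T1", "acct", 100, 150), LogRecord("T2", "stock", None, 7)]
    committed = {"T1"}                    # T2 was in flight at the failure
    on_disk = {"acct": 150, "stock": 7}   # state found after a soft failure
    print(rollback(on_disk, log, committed))
    print(rollforward(backup, log, committed))
```

Both paths arrive at the same consistent state: the committed update is
kept and the unfinished one is discarded.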
MPE/iX has a key advantage in that it provides this functionality as
an integral part of the operating system. Complete transaction
management occurs transparently for all applications. MVS/ESA provides
this functionality; however, it is implemented only through its OLTP
teleprocessing (TP) monitors, CICS or IMS/DC. OLTP applications are
dependent on these additional subsystems for transaction logging. While
virtually all MVS/ESA environments utilize these subsystems, doing so
adds to the complexity (and cost) of the systems customers must manage.
MPE/iX provides the most complete, transparent, inherent transaction
manager for consistent data integrity.
MPE/iX is the most robust commercial operating system in the industry
next to MVS.
HP and IBM are on a parity level with regard to their disk mirroring
and disk array functionality. (In fact, StorageTek uses HP's disk
technology in its "Iceberg" intelligent storage subsystems. HP is
investigating Iceberg support for a later date.) MPE/iX has an
advantage over IBM in being able to mirror disks across a WAN with
NetBase.
System Availability
IBM and HP are on a parity level with regard to SPU switchover
functionality. HP is also on a parity level with DEC clusters. HP has an
advantage over the AS/400 with regard to system availability for several
reasons. The AS/400 has no transaction manager. After a system failure,
disks must be reloaded; otherwise, the data is not recovered. This can
take more than 8 hours! In addition, the AS/400 does not have a
switchover product.
Disaster Tolerance
The HP 3000 has an advantage over DEC and IBM with
regard to wide-area disaster tolerance. IBM cannot mirror disks over a
WAN, so a remote IBM system cannot take over for a failed primary SPU.
DEC has the same problem. And DEC VAXclusters don't provide the WAN
feature required for disaster recovery (because the cluster must reside
at one location).
HP's Vision for the Future
HP will incorporate new hardware and software design features to provide
even greater availability in the future and further reduce the potential
for system failure. These features also will help minimize or eliminate
planned down-time for system maintenance.
[Figure: The Drive Towards Continuous Availability, caption: none]
Planned hardware enhancements include:
o On-line replacement for critical elements, including datacomm links,
I/O buses, and I/O interfaces
o On-line CPU and memory replacement for future HP multiprocessing
systems
o Hardware resiliency features that prevent component failures from
causing a hard crash
o "Memory de-allocation" to allow a memory board to detect correctable
errors that usually precede a failure and take itself off-line
o "Graceful degradation" to enable a processor board (in a multi-
processing system) to anticipate its own failure and take itself off-
line without bringing the system down
Software enhancements will include "bullet-proofing" all MPE/iX
subsystem applications to further enhance software resiliency. Software
resiliency features are also planned for future releases of HP-UX to
address the root causes of system panics.
Index
Aggregate Parallel Recovery (APR)
AutoRestart/iX
DEC, competing against
Disaster Recovery Services
Disk mirroring
IBM, competing against
Logical Volume Manager
MPE/iX
NetBase
NetBase Shadow
Predictive Support
Redundant Arrays of Inexpensive Disks (RAID)
SPU SwitchOver/iX
Support Watch
SwitchOver/UX
Table Monitor
Transaction manager